6 research outputs found

    Topological data analysis of human vowels: Persistent homologies across representation spaces

    Full text link
    Topological Data Analysis (TDA) has been successfully used for various tasks in signal/image processing, from visualization to supervised/unsupervised classification. Often, topological characteristics are obtained from persistent homology theory. The standard TDA pipeline starts from the raw signal data or a representation of it; it then builds a multiscale topological structure on top of the data using a pre-specified filtration, and finally computes the topological signature to be further exploited. The commonly used topological signature is a persistence diagram (or a transformation of it). Current research discusses the consequences of the many ways to exploit topological signatures, and much less often the choice of filtration, but to the best of our knowledge, the choice of signal representation has not yet been the subject of any study. This paper attempts to provide some answers to the latter problem. To this end, we collected real audio data and built a comparative study to assess the quality of the discriminant information in the topological signatures extracted from three different representation spaces. Each audio signal is represented as i) an embedding of the observed data in a higher-dimensional space using Takens' representation, ii) a spectrogram viewed as a surface in a 3D ambient space, and iii) the set of the spectrogram's zeros. From vowel audio recordings, we use topological signatures for three prediction problems: speaker gender, vowel type, and individual. We show that a topologically augmented random forest improves the out-of-bag (OOB) error over one based solely on Mel-Frequency Cepstral Coefficients (MFCC) for the last two problems. Our results also suggest that the topological information extracted from different signal representations is complementary, and that the spectrogram's zeros offer the best improvement for gender prediction.
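
    To make the first representation concrete, here is a minimal sketch of a Takens delay embedding feeding a Vietoris-Rips persistence computation. The `ripser` package, the synthetic signal, and all parameter values (dim, delay, subsampling factor) are illustrative assumptions, not the paper's configuration.

        import numpy as np
        from ripser import ripser  # assumed persistent-homology library; gudhi or giotto-tda would also work

        def takens_embedding(signal, dim=3, delay=5):
            # Map x[t] to (x[t], x[t+delay], ..., x[t+(dim-1)*delay]),
            # turning a 1-D signal into a point cloud in R^dim.
            n = len(signal) - (dim - 1) * delay
            return np.column_stack([signal[i * delay : i * delay + n] for i in range(dim)])

        # Stand-in for a vowel frame: two harmonics at 140 Hz and 280 Hz.
        t = np.linspace(0, 1, 2000)
        frame = np.sin(2 * np.pi * 140 * t) + 0.3 * np.sin(2 * np.pi * 280 * t)

        cloud = takens_embedding(frame)
        dgms = ripser(cloud[::10])['dgms']  # dgms[0]: H0 (components), dgms[1]: H1 (loops)

    For a nearly periodic signal like a vowel, the H1 diagram of such a cloud typically shows long-lived loops reflecting the periodicity, which is the kind of topological signature that can then be fed to a classifier.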

    Detecting human and non-human vocal productions in large scale audio recordings

    Full text link
    We propose an automatic data processing pipeline to extract vocal productions from large-scale natural audio recordings. Through a series of computational steps (windowing, creation of a noise class, data augmentation, re-sampling, transfer learning, Bayesian optimisation), it automatically trains a neural network for detecting various types of natural vocal productions in a noisy data stream without requiring a large sample of labeled data. We test it on two data sets: one from a group of Guinea baboons recorded at a primate research center and one from human babies recorded at home. The pipeline trains a model on 72 and 77 minutes of labeled audio recordings, with accuracies of 94.58% and 99.76%, respectively. It is then used to process 443 and 174 hours of natural continuous recordings, creating two new databases of 38.8 and 35.2 hours, respectively. We discuss the strengths and limitations of this approach, which can be applied to any massive audio recording.
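
    As an illustration of the windowing step, the sketch below cuts a long recording into fixed-length overlapping windows, each of which becomes one example for the detector. The sampling rate, window length, and hop size are illustrative assumptions, not the values used in the paper.

        import numpy as np

        def frame_audio(signal, sr, win_s=1.0, hop_s=0.5):
            # Slice a long recording into fixed-length, overlapping windows.
            win, hop = int(win_s * sr), int(hop_s * sr)
            starts = range(0, max(len(signal) - win, 0) + 1, hop)
            return np.stack([signal[s:s + win] for s in starts])

        sr = 16_000                             # assumed sampling rate
        recording = np.random.randn(60 * sr)    # stand-in for one minute of audio
        windows = frame_audio(recording, sr)    # shape: (n_windows, win_samples)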

    Detecting non-adjacent dependencies is the exception rather than the rule

    No full text
    Statistical learning refers to our sensitivity to the distributional properties of our environment. Humans have been shown to readily detect the dependency relationship between events that occur adjacently in a stream of stimuli, but processing non-adjacent dependencies (NADs) appears more challenging. In the present study, we tested the ability of human participants to detect NADs in a new Hebb-naming task that has recently been proposed to study regularity detection in a noisy environment. In three experiments, we found that most participants did not manage to extract NADs. These results suggest that the ability to learn NADs in noise is the exception rather than the rule. They provide new information about the limits of statistical learning mechanisms.

    Detection of regularities in a random environment

    No full text
    Regularity detection, or statistical learning, is regarded as a fundamental component of our cognitive system. To test the ability of human participants to detect regularity in a more ecological situation (i.e., mixed with random information), we used a simple letter-naming paradigm in which participants were instructed to name single letters presented one at a time on a computer screen. The regularity consisted of a triplet of letters that was systematically presented in that order. Participants were not told about the presence of this regularity. A variable number of random letters was presented between two repetitions of the regular triplet, making this paradigm similar to a Hebb repetition task. Hence, in this Hebb-naming task, we predicted that if any learning of the triplet occurred, naming times for the predictable letters in the triplet would decrease as the number of triplet repetitions increased. Surprisingly, across four experiments, detection of the regularity occurred only under very specific experimental conditions and was far from a trivial task. Our study provides new evidence regarding the limits of statistical learning and the critical role of contextual information in the detection (or not) of repeated patterns.
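
    The paradigm's stimulus structure can be sketched in a few lines: a fixed triplet recurs in a stream of random filler letters, with a variable gap between repetitions. The specific triplet, number of repetitions, and gap range below are illustrative choices, not the experiments' parameters.

        import random
        import string

        def hebb_stream(triplet=("K", "R", "T"), n_reps=20, gap_range=(2, 6)):
            # Build a letter stream where a fixed triplet recurs amid random fillers,
            # with a variable number of random letters between two repetitions.
            fillers = [c for c in string.ascii_uppercase if c not in triplet]
            stream = []
            for _ in range(n_reps):
                stream += random.choices(fillers, k=random.randint(*gap_range))
                stream += list(triplet)
            return stream

        print(" ".join(hebb_stream(n_reps=3)))  # e.g. 'Q M K R T F J B W K R T ...'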

    Detection and classification of vocal productions in large scale audio recordings

    No full text
    We propose an automatic data processing pipeline to extract vocal productions from large-scale natural audio recordings and classify them. The pipeline is based on a deep neural network and addresses both issues simultaneously. Through a series of computational steps (windowing, creation of a noise class, data augmentation, re-sampling, transfer learning, Bayesian optimisation), it automatically trains a neural network without requiring a large sample of labeled data or substantial computing resources. Our end-to-end methodology can handle noisy recordings made under different recording conditions. We test it on two natural audio data sets: one from a group of Guinea baboons recorded at a primate research center and one from human babies recorded at home. The pipeline trains a model on 72 and 77 minutes of labeled audio recordings, with accuracies of 94.58% and 99.76%, respectively. It is then used to process 443 and 174 hours of natural continuous recordings, creating two new databases of 38.8 and 35.2 hours, respectively. We discuss the strengths and limitations of this approach, which can be applied to any massive audio recording.
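
    The transfer-learning step could look like the following sketch, which freezes a pretrained image backbone and replaces its classification head. The ResNet-18 architecture and the three output classes are illustrative assumptions; the abstract does not specify the network used.

        import torch.nn as nn
        from torchvision import models  # assumed; the paper's backbone is not specified

        model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        for p in model.parameters():
            p.requires_grad = False     # freeze the pretrained backbone
        # New head for the task's classes, e.g. vocalisation / noise / other.
        model.fc = nn.Linear(model.fc.in_features, 3)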

    Learning Higher‐Order Transitional Probabilities in Nonhuman Primates

    No full text
    The extraction of co-occurrences between two events, A and B, is a central learning mechanism shared by all species capable of associative learning. Formally, the co-occurrence of events A and B appearing in a sequence is measured by the transitional probability (TP) between these events, which corresponds to the probability of the second stimulus given the first (i.e., p(B|A)). In the present study, nonhuman primates (Guinea baboons, Papio papio) were exposed to a serial version of the XOR (exclusive-OR) problem, in which they had to process sequences of three stimuli: A, B, and C. In this manipulation, first-order TPs (i.e., AB and BC) were uninformative, their transitional probabilities being equal to .5 (i.e., p(B|A) = p(C|B) = .5), while second-order TPs were fully predictive of the upcoming stimulus (i.e., p(C|AB) = 1). In Experiment 1, we found that baboons were able to learn second-order TPs, while no learning occurred on first-order TPs. In Experiment 2, this pattern of results was replicated, and a final test ruled out an alternative interpretation in terms of proximity to the reward. These results indicate that a nonhuman primate species can learn a nonlinearly separable problem such as XOR. They also provide fine-grained empirical data for testing models of statistical learning on the interaction between the learning of different orders of TPs. Recent bio-inspired models of associative learning are also introduced as promising alternatives for modeling statistical learning mechanisms.
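
    A worked example of the design's probability structure: with two possible A's, two possible B's, and a third item determined by XOR over the (A, B) pair, counting transitions over many trials recovers p(B|A) = .5 and p(C|AB) = 1. The stimulus coding below is illustrative, not the experiment's actual stimuli.

        import random
        from collections import Counter
        from itertools import product

        # Illustrative serial-XOR coding: the pair (A, B) fully determines C.
        xor_third = {("A1", "B1"): "C1", ("A1", "B2"): "C2",
                     ("A2", "B1"): "C2", ("A2", "B2"): "C1"}
        trials = [(a, b, xor_third[(a, b)])
                  for a, b in product(("A1", "A2"), ("B1", "B2"))] * 100
        random.shuffle(trials)

        pairs = Counter((a, b) for a, b, _ in trials)       # first-order counts
        trips = Counter(((a, b), c) for a, b, c in trials)  # second-order counts

        n_A1 = sum(v for (a, _), v in pairs.items() if a == "A1")
        print(pairs[("A1", "B1")] / n_A1)                         # p(B1|A1) = 0.5: uninformative
        print(trips[(("A1", "B1"), "C1")] / pairs[("A1", "B1")])  # p(C1|A1,B1) = 1.0: fully predictive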